Marathi Handwritten Recognition System Using Machine Learning Techniques

Authors: Mrs. Chaitrali J. Chougule, Mrs. Vasifa S. Kotwal

DOI Link: https://doi.org/10.22214/ijraset.2026.77326

Abstract

Handwriting recognition for Marathi is difficult because of the wide range of writing styles and the intricacy of the script. Machine learning techniques can help build systems that recognize handwritten Marathi text accurately. Marathi is the official language of Maharashtra and is written in the Devanagari script, which includes 36 consonants and 12 vowels. It is the fifteenth most spoken language in the world and ranks fourth in India. Recognizing handwritten characters in any script is a difficult research problem, and handwritten Marathi characters are especially hard to identify because they differ in shape, structure, writing style, and number of strokes. Sharing physical documents also takes a lot of time and effort. Marathi handwriting recognition is important in several ways, among them the safeguarding of cultural heritage. Marathi is an old language with an extensive literary legacy, and digitizing handwritten Marathi literature and documents helps preserve Marathi culture and heritage for future generations. A recognition system also promotes accessibility, because people who are blind or have trouble using text entry methods can access Marathi information more easily. The system recognizes and transcribes characters and words from handwritten Marathi script using machine learning techniques such as deep learning, convolutional neural networks, and bidirectional long short-term memory (BLSTM). It typically begins with a training phase in which the system learns from handwritten Marathi samples drawn from standard datasets. It then applies what it has learned to identify and transcribe new handwritten input.

Introduction

This paper discusses the importance of developing a Marathi Handwritten Recognition System using machine learning and deep learning techniques. Digitizing handwritten Marathi content can significantly enhance access to historical manuscripts, literature, and handwritten documents, while promoting regional language usage on digital platforms. Such systems can support applications in document processing, data entry, customer service, and business services targeting Marathi-speaking populations.

The study reviews various machine learning and deep learning approaches for handwriting recognition, particularly for the Devanagari script used in Marathi, Hindi, Sanskrit, and related languages. Traditional classifiers such as Decision Tree, KNN, Random Forest, and Extra Trees have been evaluated, with Random Forest and Extra Trees often showing superior performance. However, modern deep learning methods, especially CNN, RNN, BLSTM, CRNN, and hybrid architectures, demonstrate higher accuracy in handling variations in handwriting styles.
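To make the traditional-classifier baseline concrete, the following is a minimal sketch of how a KNN classifier could label a character image from its flattened pixels. It is an illustration under stated assumptions, not the implementation evaluated in the reviewed studies: the 3x3 toy glyphs, the class names "stroke" and "loop", and the function names are all hypothetical, and a real system would use full-size character images and a library implementation.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[int], str]  # (flattened binary pixels, class label)


def squared_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Squared Euclidean distance between two equal-length pixel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train: List[Sample], query: Sequence[int], k: int = 3) -> str:
    """Label the query by majority vote among its k nearest training samples."""
    nearest = sorted(train, key=lambda s: squared_distance(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 3x3 toy glyphs (1 = ink), flattened row by row.
# These stand in for real handwritten character images.
TRAIN: List[Sample] = [
    ((0, 1, 0, 0, 1, 0, 0, 1, 0), "stroke"),
    ((0, 1, 0, 0, 1, 0, 0, 1, 1), "stroke"),
    ((1, 1, 0, 0, 1, 0, 0, 1, 0), "stroke"),
    ((1, 1, 1, 1, 0, 1, 1, 1, 1), "loop"),
    ((1, 1, 1, 1, 0, 1, 1, 1, 0), "loop"),
    ((0, 1, 1, 1, 0, 1, 1, 1, 1), "loop"),
]

# A slightly distorted vertical stroke: its three nearest neighbours are strokes.
print(knn_predict(TRAIN, (0, 1, 0, 0, 1, 0, 1, 1, 0)))  # stroke
```

Raw pixel distance of this kind is sensitive to shifts and stroke-width changes, which is one reason such baselines tend to struggle with varied handwriting styles.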

The literature highlights major challenges including:

Variability in handwriting styles
Noise in scanned or camera-captured images
Lack of standardized datasets
Difficulty in recognizing modified characters and half-characters
Segmentation complexity due to overlapping characters

To address these issues, the proposed system includes six modules:

Data Collection & Preparation – Gathering diverse Marathi handwritten samples and splitting them into training, validation, and test sets.
Preprocessing & Segmentation – Image enhancement, noise removal, normalization, and dividing text into lines, words, and characters.
Feature Extraction – Using techniques like HOG and CNN-based feature extraction to capture script characteristics.
Training Module – Combining CNN with Bidirectional LSTM (BLSTM) to capture spatial and sequential dependencies for accurate recognition.
Recognition & Post-Processing – Classifying text and applying contextual and linguistic corrections.
Text-to-Speech Conversion – Converting recognized text into speech using APIs such as Google Text-to-Speech.
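
As an illustration of the Preprocessing & Segmentation module, the sketch below binarizes a grayscale image and splits it into text lines and then into column runs using projection profiles. This is one common approach and is assumed here for illustration; the source does not state which segmentation method the system uses. The fixed threshold of 128, the list-of-lists image format, and all function names are assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows; 0 = black ink, 255 = white paper


def binarize(image: Image, threshold: int = 128) -> List[List[int]]:
    """Map each pixel to 1 (ink) or 0 (background) using a fixed threshold."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for each run of non-zero counts."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i
        elif count == 0 and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_lines(ink: List[List[int]]) -> List[Tuple[int, int]]:
    """Horizontal projection: rows without ink separate text lines."""
    return runs([sum(row) for row in ink])


def segment_columns(ink: List[List[int]], top: int, bottom: int) -> List[Tuple[int, int]]:
    """Vertical projection within one line: blank columns separate ink runs."""
    width = len(ink[0])
    profile = [sum(ink[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs(profile)


# Toy 5x6 "page": one text line in rows 1-2, a second in row 4.
page: Image = [
    [255, 255, 255, 255, 255, 255],
    [255,   0,   0, 255,   0, 255],
    [255,   0, 255, 255,   0, 255],
    [255, 255, 255, 255, 255, 255],
    [  0,   0, 255, 255, 255, 255],
]
ink = binarize(page)
lines = segment_lines(ink)
print(lines)                              # [(1, 3), (4, 5)]
print(segment_columns(ink, *lines[0]))    # [(1, 3), (4, 5)]
```

In Devanagari the headline (shirorekha) joins the characters of a word, so blank-column splitting of this kind typically isolates words rather than individual characters. Separating characters would need an extra step, such as locating and removing the headline row first.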

Conclusion

Building a Marathi handwriting recognition system with machine learning offers substantial advantages in accuracy, efficiency, customization, scalability, and automation. Machine learning algorithms excel at identifying intricate patterns and features in handwriting that can be hard for people to perceive, which makes highly accurate recognition systems possible. Such systems streamline the transcription of handwritten documents and significantly reduce the time and effort required compared with manual transcription. The development methodology comprises several critical steps: data preprocessing, feature extraction, model training, evaluation, optimization, and deployment. These steps involve preparing and refining the data, extracting features that characterize handwriting styles, training a model on a dataset of handwritten samples, assessing its performance, tuning its parameters for better accuracy, and finally deploying it in a production environment for real-time recognition. The resulting system is a versatile tool for applications that process handwritten documents and data, such as digitizing historical documents, recognizing handwritten notes in real time, and improving accessibility for individuals with disabilities.

Copyright

Copyright © 2026 Mrs. Chaitrali J. Chougule, Mrs. Vasifa S. Kotwal. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Paper Id : IJRASET77326

Publish Date : 2026-02-06

ISSN : 2321-9653

Publisher Name : IJRASET
